If you’re interested … https://tomelliott.co.nz/phd
my side-project since 2013/14
shifting focus as audience has evolved
pre 2015: school/some university
2015–2019: education (school/university/FutureLearn), and cropping up in unexpected places (around the world)
recently:
Democratisation
(See Chris Wild’s talks, featuring hits like We Will Plot You)
rapid research development tools
(Andrew Sporle) for organisations short on money, time, or both
recent focus on surveys — now handled natively!
key goal is removal of barriers
Data
GUI
Explore
Save output / script
data is from a survey?
In R
iNZight isn’t much better … or is it?!
(Remember survey variables never have nice names)
data = "mysurvey.csv"
weights = "wt0"
repweights = "^w[0-4]"
reptype = "JK1"
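For comparison, here is a rough sketch of what the same four settings look like when written by hand with the ‘survey’ package. The data frame is a made-up stand-in for mysurvey.csv (the variable y and the replicate columns w1 to w4 are assumptions for illustration), and this is not necessarily how iNZight builds the design internally.

```r
library(survey)

# Hypothetical stand-in for "mysurvey.csv": 20 respondents, equal base weights
set.seed(1)
n <- 20
mysurvey <- data.frame(y = rnorm(n), wt0 = rep(10, n))

# Four delete-a-group jackknife replicate weights, named w1..w4 (assumed names)
groups <- rep(1:4, length.out = n)
for (g in 1:4) {
  mysurvey[[paste0("w", g)]] <- ifelse(groups == g, 0, mysurvey$wt0 * 4 / 3)
}

# weights, repweights and reptype from the specification map onto
# svrepdesign()'s weights, repweights (a regular expression) and type
des <- svrepdesign(
  data = mysurvey,
  weights = ~wt0,
  repweights = "^w[0-4]",
  type = "JK1"
)

est <- svymean(~y, des)
est
```

The regular expression matches w1 to w4 but not wt0, so the base weight is not mistaken for a replicate weight.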
User doesn’t have to know about the underlying survey design
Researchers can quickly open and explore a (survey) data set
Everything is taken care of
plots (dotplots become histograms, scatter plots become bubble plots or hexbin plots)
summary tables give population counts (plus errors)
data wrangling functions use the correct methods
e.g., the subset() method from the ‘survey’ package for filtering
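To make the last two points concrete, here is a small sketch using the ‘survey’ package directly on its bundled api data. It shows the kind of calls involved (population totals with standard errors, and design-aware filtering); it is an illustration under that assumption, not a listing of iNZight's internal code.

```r
library(survey)
data(api, package = "survey")

# One-stage cluster sample of school districts, from the 'survey' examples
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Estimated population counts by school type, with standard errors
totals <- svytotal(~stype, dclus1)
totals

# Filtering with subset() returns a survey design, so the design
# information travels with the filtered data
high <- subset(dclus1, api99 >= 700)
svymean(~api00, high)
```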
iNZight is not just a single R package
collection of 9+ ’iNZight*’ packages with specific tasks
‘iNZightPlots’ makes graphs
‘iNZightTools’ provides a suite of utility functions (data wrangling)
main GUI package provides interface and collects user inputs (and displays results)
wrapper functions make programming GUIs much easier — just a case of mapping inputs to arguments
… and allow us to return the behind-the-scenes R code!
library(iNZightTools)
iris_filtered <- filterNumeric(iris, "Sepal.Width", "<", 100)
head(iris_filtered)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
code(iris_filtered)
## [1] "iris %>% dplyr::filter(Sepal.Width < 100)"
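The idea of a wrapper that both does the work and hands back the code can be sketched in a few lines of base R. The names filter_numeric_sketch() and code_sketch() are hypothetical, and storing the code in an attribute is an assumption for illustration; the real iNZightTools functions may be implemented differently.

```r
# Hypothetical wrapper: GUI inputs (variable, operator, value) become arguments
filter_numeric_sketch <- function(data, var, op, value) {
  stopifnot(op %in% c("<", "<=", ">", ">=", "==", "!="))
  data_name <- deparse(substitute(data))

  # Build the R code as text, run it, then attach the text to the result
  code_str <- sprintf("subset(%s, %s %s %s)", data_name, var, op, format(value))
  result <- eval(parse(text = code_str), envir = parent.frame())
  attr(result, "code") <- code_str
  result
}

# Hypothetical accessor for the stored code
code_sketch <- function(x) attr(x, "code")

iris_wide <- filter_numeric_sketch(iris, "Sepal.Width", ">", 4)
code_sketch(iris_wide)
```

Because the code is generated before it is run, the script shown to the user is the code that was executed.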
recent work involved modifying wrapper functions to handle surveys
the GUI just needs to pass around a ‘data-thing’ (either data or survey)
library(survey)
data(api, package = "survey")
dclus2 <- svydesign(
  id = ~dnum + snum,
  fpc = ~fpc1 + fpc2,
  data = apiclus2
)
dclus2_filtered <- filterNumeric(dclus2, "api99", ">=", 700)
code(dclus2_filtered)
## [1] "dclus2 %>% srvyr::as_survey() %>% srvyr::filter(api99 >= 700)"
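One way to let a single function accept a ‘data-thing’ is S3 dispatch on the object's class. The sketch below is an assumption about the general approach, not iNZightTools' actual implementation, and the generic keep_at_least() is a hypothetical name.

```r
library(survey)
data(api, package = "survey")

# Hypothetical generic: the caller does not need to know which kind of
# object it is holding
keep_at_least <- function(x, var, min) UseMethod("keep_at_least")

# Plain data frames: ordinary row filtering
keep_at_least.data.frame <- function(x, var, min) {
  v <- x[[var]]
  x[!is.na(v) & v >= min, , drop = FALSE]
}

# Survey designs: index the design object itself so that weights and
# cluster information are filtered along with the variables
keep_at_least.survey.design <- function(x, var, min) {
  v <- x$variables[[var]]
  x[!is.na(v) & v >= min, ]
}

dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

plain_filtered <- keep_at_least(apiclus2, "api99", 700)
design_filtered <- keep_at_least(dclus2, "api99", 700)
class(design_filtered)
```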
Big thanks to the ‘srvyr’ package!
How does this all relate to my postdoc?
Rourou = basket
Nā tō rourou, nā taku rourou, ka ora ai te iwi.
(With your food basket and my food basket the people will thrive.)
Tātaritanga = analysis
“Tools for analytics and sharing data for the betterment of communities.”
Or: “Informatics for Social Services and Wellbeing”
Improve data standards
Promote Māori data sovereignty
Develop systems to support access
Evaluate synthesising of datasets
Security and privacy implications
Machine learning and AI methods
database connecting data across NZ’s sectors
high security environment
but also other unnecessary barriers: coding!
many upcoming researchers will have used iNZight at high school or university
no need to learn to code, OR remember how to do things you haven’t done in 2 years
currently working on deploying a demo of iNZight in the Stats NZ data lab — watch this space!
lots of data outside the datalab
many iwi groups, Pacific nations, etc. have specific needs for simple (to complex) population summaries/demographic outputs
iNZight means they can do it every 1–2 years without needing to train/retrain/pay expensive statisticians
iNZight also produces code: generate script to re-run/edit as necessary (without having to do all the hard stuff first)
why limit yourself to tables when you can fit hierarchical Bayesian models with model-specific priors, likelihoods, … ?
John Bryant has a set of R packages (dembase, demest, …) for doing Bayesian demography
using them is a bit of a challenge (especially if you don’t do much R coding!)
so we tested out iNZight’s new add-on system …
Both work and ‘fun’
to get access to the IDI, you need to put together a research proposal
putting together a research proposal requires knowing what data is available to investigate
that data is hidden away in the IDI
we put together a simple web app providing a searchable database so prospective (and current) IDI researchers can explore what’s available
built using ReactJS
the display in 302 was broken
so I rebuilt it, this time using ReactJS + D3
simpler than the last version (no ‘history’ as it just uses real-time data, no backing server)
it’s my goal to, one day, put together a prototype of a new version of iNZight using ReactJS and Rserve
one version that runs on Windows / macOS / Linux / web
plus capability of having a local R server, remote R server - firewall, etc.
GitHub: tmelliott | iNZightVIT | terourou
Twitter: @tomelliottnz | @iNZightUoA | @terourou
tomelliott.co.nz | inzight.nz | terourou.org